INTRODUCTION


Several investigators have published data and theoretical arguments
which suggest that a correlation exists between oceanic chlorophyll
concentration and the ocean's color spectrum. There is a need, however,
for a more quantitative description of the nature and variability of
this relationship. This paper presents some statistical results and
conclusions, based on direct comparisons of the ocean's color and
chlorophyll concentration, which may help to fill this gap.

The ocean color and ground-truth data for this analysis were collected
during Mission 140 of NASA's NP3A earth resources aircraft over the period
from 6 through 14 August 1970. During this experiment, chlorophyll and
light attenuation data were collected by Oregon State University's R/V
CAYUSE and the chartered R/V JUDY K. All sets of comparative observations
are simultaneous in the sense that the ship began sampling when the
aircraft came overhead.

This investigation is supported by the U. S. Naval Oceanographic 
Office (Spacecraft Oceanography Project) contract number N62306-70-C-0Ul . 

The use of a spectrometer and the valuable services of Mr. Richard 
Ramsey were provided, under subcontract, by TRW Systems Group, TRW INC., 
Redondo Beach, California. 

DESCRIPTION OF APPARATUS 


Ocean color spectra were observed with an off-plane, Ebert type,
blazed diffraction grating spectrometer designed and built by TRW Inc.
The spectral resolution of the spectrometer is between 5 and 7.5 nanometers.
Approximately one second is required to observe one spectrum between 750
and 400 nanometers, and approximately 3 seconds elapse after the start of
one spectrum until the start of the next.

The output from the TRW spectrometer was recorded both on a Sanborn 
strip chart recorder and on analog magnetic tape. The strip chart records 
were used for the present investigation. 

Pigment samples were collected by filtering water from Van Dorn sampling
bottles, with subsequent handling as described in Strickland and Parsons
(1965, pp. 117-127).

Light attenuation data were observed aboard the ships using both
a submersible photometer (flat-plate detector) and a standard white secchi
disk.


PROCEDURES AND METHODS OF ANALYSIS 


Water samples were taken and light attenuation was measured by the crew
of each research vessel as the NP3A aircraft arrived overhead at a
station. Aboard the aircraft, the TRW spectrometer was
directed away from the sun's azimuth and tilted 15° from nadir to avoid
the sun glitter pattern. At each station the aircraft flew two or more
data runs, each ideally about two miles in length and centered on
the ship's position.


CHLOROPHYLL DATA PROCESSING 


Filtered pigment samples were frozen and subsequently analysed for
chlorophyll a, b, and c concentrations by the methods described in Strickland
and Parsons (1965, pp. 117-127). The samples were also analysed for plant and
animal carotenoid concentrations, but since these were uniformly less than
1 mg/m³ we have ignored carotenoids in this first analysis.

The blue absorption bands of the three chlorophylls overlap considerably,
and their distinctive red bands are overwhelmingly masked by water's
absorption of red light. Therefore, we have assumed that, to a first
approximation, the effect of a given concentration of total chlorophyll
(a + b + c) on water color does not depend on the proportions of a, b, and c.

The effect of water and its contents on the daylight spectrum is a 
function of the pathlength followed by the light, i.e. upon some color 
producing depth to which light penetrates and is backscattered up into 
the atmosphere. It is intuitively reasonable to assume that a spectrometer 
responds to light backscattered from a color-producing depth about the 
same as the depth which a human eye can see. On this reasoning then, we 
have assumed color-producing depth to equal secchi-depth. Secchi-depth 
is the depth at which a large white disk may barely be seen by an observer 
above the water surface. 

Hence, the measure of chlorophyll concentration which we have attempted
to relate to the ocean's color spectrum is chlorophyll (a + b + c) averaged
over secchi-depth. In the sequel we will refer to this quantity simply as
"chlorophyll concentration".

Our chlorophyll data were collected at discrete depths, making it 
necessary for us to approximate a continuous distribution by assuming a 
linear trend between observations. This profile was then truncated at 
secchi-depth and averaged as though the data had been collected continuously. 
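
The averaging just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the program used for the analysis: the function name and arguments are hypothetical, and holding the shallowest and deepest sample values constant where the samples do not span the layer is an assumption that the text does not state.

```python
def depth_averaged_chlorophyll(depths, chl, secchi_depth):
    """Average a piecewise-linear chlorophyll profile from the surface to secchi_depth.

    depths: sampling depths in meters, in increasing order.
    chl: total chlorophyll (a + b + c) at each depth, in mg/m^3.
    """
    if len(depths) != len(chl) or len(depths) == 0:
        raise ValueError("depths and chl must be non-empty and of equal length")
    if secchi_depth <= 0:
        raise ValueError("secchi_depth must be positive")

    z = list(depths)
    c = list(chl)
    # Assumption (not from the text): hold the nearest sample value constant
    # where the samples do not reach the surface or the secchi-depth.
    if z[0] > 0:
        z.insert(0, 0.0)
        c.insert(0, c[0])
    if z[-1] < secchi_depth:
        z.append(secchi_depth)
        c.append(c[-1])

    integral = 0.0
    for (z0, c0), (z1, c1) in zip(zip(z, c), zip(z[1:], c[1:])):
        if z0 >= secchi_depth:
            break
        if z1 > secchi_depth:
            # Truncate the last segment at secchi_depth by linear interpolation.
            c1 = c0 + (c1 - c0) * (secchi_depth - z0) / (z1 - z0)
            z1 = secchi_depth
        # The trapezoid rule is exact for a linear segment.
        integral += 0.5 * (c0 + c1) * (z1 - z0)
    return integral / secchi_depth
```

For samples of 1, 3 and 5 mg/m³ at 0, 10 and 20 meters and a secchi-depth of 15 meters, the profile is truncated at 15 meters and the function returns 2.5 mg/m³.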


OCEAN COLOR SPECTRA PROCESSING


The strip charts from the TRW spectrometer system were calibrated
for wavelength, and then digitized on-line to Oregon State University's
CDC3300 computer. A program was developed to convert each spectrum into
a vector of 70 irradiance components, each representing a 5-nm band between
400 and 750 nm. The same program corrected each spectrum for amplifier
gain setting, dark-voltage error (a function of wavelength and slit number),
variations in slit width, and calibrated photomultiplier response (a function
of wavelength).


Data Smoothing 


During the one second required to observe a single spectrum, the
aircraft's forward motion caused the spectrometer to scan a strip of
water about 100 meters long. This phenomenon caused each component of
each spectrum to represent a distinctly different area on the ocean's
surface. So if whitecaps are randomly distributed on the sea surface,
or if small-scale random fluctuations in cloud cover cause fluctuations
in incident irradiance, fluctuations will occur randomly in wavelength
in the observed upwelled light spectra. This process is illustrated in
Figure I.

Random fluctuations in cloud densities can, of course, occur on scales 
larger than 100 meters. This causes a variation in overall irradiance levels, 
but individual spectra are not contaminated internally. Variations in 
irradiance level make it impossible to directly compare different spectra 
though. 

The second noise effect is routinely compensated for by normalizing 
each spectrum with respect to irradiance observed at some reference wave- 
length, e.g. 570 nm. This practice is risky in the presence of small scale 
fluctuations, however, for we have no assurance that the normalization 
wavelength is not contaminated in some non-average way. To avoid this 
difficulty, we departed from custom and normalized each spectrum with 
respect to its own mean irradiance. 

Small scale whitecap and cloud induced noise is not so easily smoothed
over, but we have developed a satisfactory method of doing so. If we average
within each wavelength band over several spectra, then each component of
the average spectrum should contain an average amount of small scale noise.
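
A minimal sketch of these two smoothing steps, normalization of each spectrum by its own mean followed by averaging within each wavelength band, might look as follows. The function names are hypothetical and each spectrum is assumed to be a plain list of band irradiances.

```python
from math import fsum


def normalize_by_own_mean(spectrum):
    """Express each band as a fraction of the spectrum's own mean irradiance."""
    mean = fsum(spectrum) / len(spectrum)
    if mean == 0:
        raise ValueError("spectrum has zero mean irradiance")
    return [value / mean for value in spectrum]


def mean_spectrum(spectra):
    """Average the normalized spectra band by band."""
    normalized = [normalize_by_own_mean(s) for s in spectra]
    n = len(normalized)
    # zip(*normalized) regroups the data by wavelength band.
    return [fsum(band) / n for band in zip(*normalized)]
```

Because every spectrum is divided by its own mean, two spectra that differ only in overall irradiance level (for example because of a large cloud) contribute identically to the average.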

Occasionally the spectrometer viewed an unusually large whitecap, or 
a ship. Any such occurrence caused an anomalously large fluctuation that 
was not representative of the average conditions for the set of spectra 
being considered. The filtering problem we faced was to identify these 
anomalous fluctuations, and then to remove their effects from the smoothed 
data. 

An "outlier" is an observation determined by some statistical
criterion to belong to a population different from that to which the rest
of the observations in the sample belong. Anomalously large fluctuations
in our data were regarded as outliers, and the following iterative smoothing
procedure was adopted:

1. After each spectral component was expressed as a fraction of the
mean for that spectrum, natural logarithms were taken of the data to
improve the normality of distribution. After this transformation,
the variance behaved linearly as a function of wavelength over the
regions 420-580 nm and 580-700 nm.

2. For each wavelength band, the variances from all data runs for
a given station were pooled. Then a least squares linear regression
was performed to relate variance to wavelength. The regression
estimates of $s^2$ (sample variance) were then used to compute, for each
wavelength band, the statistic

$$\frac{\max\,|I - \bar{I}|}{s}$$

where $I$ is relative irradiance for a particular wavelength band and
spectrum, and $\bar{I}$ is the mean for that wavelength band and data run.
Critical values of this statistic are tabulated in Halperin et al.
(1955). For simplicity we assumed 25 degrees of freedom for the
regression estimate of $s$, even though we might argue for a larger
number on the strength of the large effective sample size.

3. When for any band the statistic exceeded the tabulated critical value, the
rejected component was examined in the original, untransformed data
matrix. The anomalous fluctuation associated with that outlier was
then removed by substitution of values which made the offending spectrum
behave locally in a manner consistent with the behavior of the rest
of the spectra in the sample. This usually involved adjustment of
a few data points adjacent to the outlier, even though these were
not rejected on the basis of their own magnitudes.

4. The entire process was repeated iteratively until no outliers
were detected. The successive smoothing of background noise allowed
discovery of outliers in secondary iterations that were undetectable
against the noise level of the earlier iteration(s).
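
The detection part of steps 1 and 2 can be sketched as below. This is a simplified illustration, not the original program: it handles a single data run rather than pooling variances over a station, fits one regression line rather than separate lines for the two wavelength regions, and takes the critical value as an argument because the tabulated values are not reproduced here. All names are hypothetical.

```python
from math import fsum, log, sqrt


def fit_line(x, y):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(x)
    mx, my = fsum(x) / n, fsum(y) / n
    sxx = fsum((xi - mx) ** 2 for xi in x)
    sxy = fsum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def flag_outliers(spectra, wavelengths, critical_value):
    """Return (band_index, spectrum_index) pairs whose statistic exceeds critical_value.

    spectra: mean-normalized, strictly positive spectra from one data run.
    wavelengths: band-center wavelengths, one per spectral component.
    critical_value: caller-supplied threshold (hypothetical here).
    """
    logs = [[log(v) for v in s] for s in spectra]
    n = len(logs)
    bands = list(zip(*logs))
    means = [fsum(b) / n for b in bands]
    variances = [fsum((v - m) ** 2 for v in b) / (n - 1)
                 for b, m in zip(bands, means)]
    # Regress variance on wavelength and use the fitted variance in each band.
    a, slope = fit_line(wavelengths, variances)
    flagged = []
    for j, (band, m) in enumerate(zip(bands, means)):
        s2 = a + slope * wavelengths[j]
        if s2 <= 0:
            continue  # the fitted variance is unusable for this band
        deviations = [abs(v - m) for v in band]
        worst = max(range(n), key=deviations.__getitem__)
        if deviations[worst] / sqrt(s2) > critical_value:
            flagged.append((j, worst))
    return flagged
```

In keeping with step 4, a caller would repair each flagged component in the untransformed data and call the function again until it returns an empty list.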

The final step in smoothing the ocean color spectra was to examine 
each data run for linear trends which might be attributable to horizontal 
gradients of chlorophyll concentration. Least squares regression coefficients 
were calculated for each wavelength band in each data run, and then these 
coefficients were smoothed by taking a simple moving average over wavelength. 
The smoothed regression coefficients were used, finally, to adjust the 
smoothed mean spectrum to correspond to the position in the data run occupied 
by the research vessel. 
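
The trend adjustment can be illustrated with the following sketch. It assumes that each spectrum in a data run carries an along-track position and that the moving average uses a short, centered window; the window length is not given in the text, so the default of 3 bands is purely an assumption, as are all the names.

```python
from math import fsum


def slope(x, y):
    """Least squares slope of y against x."""
    n = len(x)
    mx, my = fsum(x) / n, fsum(y) / n
    sxx = fsum((xi - mx) ** 2 for xi in x)
    return fsum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx


def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the two ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(fsum(values[lo:hi]) / (hi - lo))
    return out


def adjust_to_ship(spectra, positions, ship_position, window=3):
    """Shift the band-wise mean spectrum along the fitted trend to the ship's position."""
    n = len(spectra)
    bands = list(zip(*spectra))
    mean_position = fsum(positions) / n
    means = [fsum(b) / n for b in bands]
    # One regression coefficient per band, then smoothed over wavelength.
    slopes = moving_average([slope(positions, b) for b in bands], window)
    return [m + s * (ship_position - mean_position)
            for m, s in zip(means, slopes)]
```

The least squares line for each band passes through the mean position and the band mean, which is why the adjustment is applied relative to the mean position of the run.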





Principal Component Analysis 


For any sample of N-component vectors, the N N-component eigenvectors
of the sample covariance matrix form the bases of an orthogonal coordinate
system into which the original observations may be transformed without any
loss of information. When extracted, this eigensystem will always be aligned
so that the first eigenvector, $e_{1j}$ (j = 1, 2, ..., N), defines the direction of
maximum sample variance. Further, $e_{2j}$ defines the direction of the next
largest amount of sample variance orthogonal to $e_{1j}$, and so forth, with each
successive eigenvector representing a direction of lesser variance than
any of its predecessors. Because of this property, if the sample variations
occur in definite modes, i.e. if the original N variables vary together in
definable ways, then most of the variation may be accounted for in terms
of only the first few eigenvectors.

An original observation vector $X_j$ (j = 1, 2, ..., N) may be transformed
into its eigensystem representation through the k equations

$$Y_i = \sum_{j=1}^{N} e_{ij}\,(X_j - E_j), \qquad i = 1, 2, \ldots, k \tag{1}$$

where $E_j$ is the sample mean vector, $e_{ij}$ is the $i$th eigenvector, and
$Y_i$ is called the $i$th principal component of $X_j$. Thus we effect the
transformation

$$(X_1, X_2, \ldots, X_N) \rightarrow (Y_1, Y_2, \ldots, Y_k)$$

where k is selected less than or equal to N to retain the desired proportion
of sample variance. When k is considerably less than N, we gain a significant
reduction in the number of variables in exchange for a defined, and
presumably acceptable, loss of sample variation.

Any original vector $X_j$ may be recovered from its principal component
representation $Y_i$ through the N equations

$$X_j = E_j + Y_1 e_{1j} + Y_2 e_{2j} + \cdots + Y_k e_{kj} \tag{2}$$

where j = 1, 2, ..., N.
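
Equations (1) and (2) amount to a projection onto the eigenvectors and a reconstruction from them. The sketch below assumes the eigenvectors are already available as orthonormal lists; it does not extract them from a covariance matrix. The function names and the small two-component example are hypothetical.

```python
def to_principal_components(x, mean, eigenvectors):
    """Equation (1): Y_i = sum over j of e_ij * (X_j - E_j), one value per eigenvector."""
    return [sum(e_j * (x_j - m_j) for e_j, x_j, m_j in zip(e, x, mean))
            for e in eigenvectors]


def from_principal_components(y, mean, eigenvectors):
    """Equation (2): X_j = E_j + sum over i of Y_i * e_ij."""
    return [m_j + sum(y_i * e[j] for y_i, e in zip(y, eigenvectors))
            for j, m_j in enumerate(mean)]


# Hypothetical two-component example with orthonormal eigenvectors.
mean = [1.0, 2.0]
eigenvectors = [[0.6, 0.8], [-0.8, 0.6]]
y = to_principal_components([2.0, 4.0], mean, eigenvectors)
x_full = from_principal_components(y, mean, eigenvectors)              # all components kept
x_first = from_principal_components(y[:1], mean, eigenvectors[:1])     # first component only
```

Here `x_full` reproduces the original vector, while `x_first` is the approximation obtained when only the first principal component is retained.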

Morrison (1967) gives a thorough, yet readable, introduction to
principal component analysis. Simonds (1963) and Church (1966) discuss
examples similar to the present application, i.e. analysis of data recorded
in the form of a curve. This method is alternatively referred to as
"principal component analysis", "eigenvector analysis", or "characteristic
vector analysis" in the literature.


It is possible to apply regression results obtained using the principal
components of one sample to subsequently observed samples. Simply regard
the mean vector of the original sample ($E_j$) as the origin of the coordinate
system and apply equation (1), using the eigenvectors ($e_{ij}$) obtained from
the original sample. For example, to estimate chlorophyll concentration,
using equation (3) below, from any observed ocean color spectrum, use the
values of $E_j$, $e_{1j}$, and $e_{2j}$ given in Figure II. With these values determine
$Y_1$ and $Y_2$ from equation (1) and substitute into equation (3).

Multiple Regression Analysis 


Linear least squares multiple regression analyses were performed
using standard methods. The first k principal components $Y_i$ were treated
as independent variables, with alternatively chlorophyll concentration ($C$)
or secchi-depth ($Z_s$) as dependent variable, yielding equations of the form

$$C = A + B_1 Y_1 + B_2 Y_2 + \cdots + B_k Y_k$$

The squared multiple correlation coefficient ($R^2$) was calculated to
estimate the percentage of variability in the dependent variable accounted
for by the regression equation.


We also calculated, as a measure of precision, the residual standard
deviation

$$S_{c.y} = \sqrt{\frac{\sum_{m=1}^{n} (C_m - \hat{C}_m)^2}{n - q}}$$

where $C_m$ and $\hat{C}_m$ are respectively the observed value and regression estimate
of the dependent variable, n is the number of observations, and q = k + 1
is the total number of variables in the regression equation.
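
Both goodness-of-fit measures follow directly from the observed values and the regression estimates. The sketch below computes $R^2$ as one minus the ratio of residual to total sum of squares, which is the usual form for a least squares fit that includes an intercept; the function name is hypothetical.

```python
from math import fsum, sqrt


def regression_diagnostics(observed, estimated, k):
    """Return (R squared, residual standard deviation) for a fit with k predictors."""
    n = len(observed)
    q = k + 1  # k regression coefficients plus the intercept
    if n <= q:
        raise ValueError("need more observations than fitted parameters")
    mean = fsum(observed) / n
    ss_residual = fsum((c - c_hat) ** 2 for c, c_hat in zip(observed, estimated))
    ss_total = fsum((c - mean) ** 2 for c in observed)
    r_squared = 1.0 - ss_residual / ss_total
    residual_sd = sqrt(ss_residual / (n - q))
    return r_squared, residual_sd
```

Dividing by n - q rather than n accounts for the degrees of freedom used up by the fitted coefficients.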

Finally, simultaneous confidence limits were calculated for the regression 
coefficients. When such a confidence interval contained zero, it was 
regarded as strong evidence of no significant correlation between the 
associated independent variable and the dependent variable, allowing 
that term to be dropped from the equation. 


DISCUSSION OF RESULTS 


A principal component analysis was performed on the sample of 31
trend-adjusted mean color spectra, each having 55 components between 420
and 695 nm. The mean vector ($E_j$) and first two eigenvectors ($e_{ij}$; i = 1, 2;
j = 1, 2, ..., 55) are given in Figure II.

The first eigenvector ($e_{1j}$) accounts for 75% of sample variance, and
the second eigenvector ($e_{2j}$) accounts for an additional 20%. Thus, 95%
of the total variance in the sample of 31 55-component color vectors
($X_1$, $X_2$, ..., $X_{55}$) is contained in the 31 pairs ($Y_1$, $Y_2$).


CHLOROPHYLL CONCENTRATION


A regression analysis was performed using ln(0.508 + C) as the dependent
variable, and $Y_1$ and $Y_2$ as independent variables, with the result

$$\ln(0.508 + C) = 1.0085 - 0.5149\,Y_1 + 1.2790\,Y_2 \tag{3}$$

where C is chlorophyll concentration in mg/m³, and the arbitrary constant
0.508 is the ratio of the average specific absorption coefficients of water
and chlorophyll. Equation (3) is significant at the 0.005 level and accounts
for 77% of the variability in the observed chlorophyll data (over the range
from 0.00 mg/m³ to 8.43 mg/m³). The residual standard deviation was ±1.62
mg/m³. These results are collected in Table I and illustrated in the scatter
diagrams of Figure III. In Figure III note that residual deviation is
measured from the plane of the two regression lines; a glance at only one
marginal distribution can be misleading.
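
To obtain a concentration from equation (3), the logarithm has to be inverted. A minimal sketch, with a hypothetical function name and the coefficients as given in equation (3) and Table I, is:

```python
from math import exp


def chlorophyll_from_components(y1, y2):
    """Invert equation (3): ln(0.508 + C) = 1.0085 - 0.5149*Y1 + 1.2790*Y2.

    Returns C in mg/m^3. The result is not clamped, so sufficiently large Y1
    or small Y2 can give a slightly negative estimate.
    """
    return exp(1.0085 - 0.5149 * y1 + 1.2790 * y2) - 0.508
```

At the sample mean spectrum (both principal components zero) this gives roughly 2.23 mg/m³. The estimate decreases as $Y_1$ increases and increases with $Y_2$, as the signs of the coefficients require.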


SECCHI-DEPTH


Ocean color is produced by wavelength selective processes acting over 
color producing depth, which we have assumed to be equal to secchi-depth. 
Therefore we expect a strong correlation between secchi-depth and the 
ocean color spectrum. 

As a first step we plotted the scatter diagrams shown in Figure IV.
These graphs suggest a partitioning of the sample, with classification
according to which research vessel collected the ground truth data. This
distinction is a result of the different ocean environments in which the
two vessels operated. The JUDY K operated exclusively within a few miles
of the mouth of the Columbia River, where we may assume uniformly high
densities of suspended particles. The CAYUSE, on the other hand, operated
in a more oceanic regime well away from the river mouth, where we may
assume relatively low densities of suspended particles. We have tentatively
and arbitrarily described the two environments as, respectively, a
particle-scattering dominated and an absorption dominated ocean color system.

A multiple regression analysis of secchi-depth ($Z_s$) versus $Y_1$ and $Y_2$
for the particle-scattering dominated subsample yielded

$$Z_{s(p)} = 5.483 + 1.768\,Y_1 \ \text{meters} \tag{4}$$

where the regression coefficient of $Y_2$ was shown to be not significantly
different from zero by simultaneous confidence intervals. Equation (4)
accounts for only 55% of the variability in secchi-depth, but the sample
range was very narrow (2.5 to 6.5 meters). The residual standard deviation
is only ± 1 meter. These results are collected in Table IIa and illustrated
in Figure IVa.

A similar analysis of the absorption dominated subsample yielded

$$Z_{s(a)} = 9.241 - 7.833\,Y_2 \ \text{meters} \tag{5}$$

where the regression coefficient of $Y_1$ was shown to be not significantly
different from zero by simultaneous confidence intervals. Equation (5)
accounts for 82% of the variability in secchi-depth over a range from 6 to
20 meters. Residual standard deviation is ± 2.16 meters. These results
are collected in Table IIb and depicted in Figure IVb.


INTERPRETATION AND APPLICATION 


Through joint use of equations 3 through 5 it is possible to estimate 
chlorophyll concentration averaged over secchi-depth, estimate secchi-depth 
(i.e. the depth of the layer in which we are estimating chlorophyll), and 
say something about particle concentrations (at least qualitatively). 

For routine application of these results, it should be possible to
estimate $Y_1$ and $Y_2$ from irradiance measurements in three narrow wavelength
bands ($I_1$, $I_2$, $I_3$). By solving the three equations

$$\ln I_j - \ln \bar{I} = E_j + Y_1 e_{1j} + Y_2 e_{2j}, \qquad j = 1, 2, 3$$

for $Y_1$ and $Y_2$ ($\ln \bar{I}$ is also unknown, but it need not be solved for explicitly),
estimates may be obtained for substitution in equations (3) through (5).
Values for $E_j$, $e_{1j}$ and $e_{2j}$ are to be taken from Figure II for the appropriate
wavelengths.

For example, if irradiances are measured in three 10 nm-wide wavelength
bands centered on 495 nm (j = 1), 547.5 nm (j = 2) and 602.5 nm (j = 3), then

Y 1 " 0.300 ^ ln ^1^ + 

Y 2 " 0.208 [ ln f^^]- °’ 33C *j 

which is nearly as simple as attempting to measure chlorophyll from only 
a single measured ratio of irradiances. 
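
The three-band estimate is a small linear system in the unknowns ln of the mean irradiance, $Y_1$ and $Y_2$. The sketch below solves it by Cramer's rule. The function names are hypothetical, and the mean-vector and eigenvector values must be supplied by the caller, since the contents of Figure II are not reproduced here.

```python
from math import log


def det3(m):
    """Determinant of a 3 x 3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def components_from_three_bands(irradiance, mean_vec, e1, e2):
    """Solve ln(I_j) - ln(I_mean) = E_j + Y1*e1_j + Y2*e2_j for (Y1, Y2), j = 1, 2, 3.

    irradiance: the three measured band irradiances (positive).
    mean_vec, e1, e2: E_j and the first two eigenvector components at the
    same three wavelengths (caller-supplied; none are built in).
    """
    # Rearranged: ln(I_mean) + e1_j*Y1 + e2_j*Y2 = ln(I_j) - E_j
    rhs = [log(i_j) - e_j for i_j, e_j in zip(irradiance, mean_vec)]
    a = [[1.0, e1[j], e2[j]] for j in range(3)]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("the three bands do not determine Y1 and Y2 uniquely")

    def with_column(col):
        # Copy of the matrix with one column replaced by the right-hand side.
        return [[rhs[r] if c == col else a[r][c] for c in range(3)]
                for r in range(3)]

    return det3(with_column(1)) / d, det3(with_column(2)) / d
```

The unknown mean irradiance occupies the first column of the system and is eliminated automatically, so it never has to be computed.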


CONCLUDING REMARKS 


We have shown by empirical means that it is possible to measure chlorophyll
concentration in the ocean as a linear function of the first two principal
components of the ocean's color spectrum. The residual standard deviation
of the estimate is about ± 1.6 mg/m³.

The chlorophyll concentration estimated is the average over secchi-depth,
which may also be estimated: a) within about 25% in very turbid waters as
a linear function of $Y_1$, and b) within about in relatively clear ocean
waters as a linear function of $Y_2$.

For routine application of these results, $Y_1$ and $Y_2$ may be estimated
easily from irradiance measurements in only three narrow wavelength bands.

A 3-channel measurement will not suffice for future investigations aimed 
at broadening or improving these results, however. A full spectrum analysis 
by methods similar to those presented here is recommended for that task. 

The relatively poor fit to the chlorophyll data (± 1.6 mg/m³) suggests
that we next conduct a covariance analysis to examine the joint effects of
chlorophyll, particle scattering and secchi-depth on ocean color.


FIGURE III (b)


TABLE I

CHLOROPHYLL CONCENTRATION and the PRINCIPAL COMPONENTS OF OCEAN COLOR
MULTIPLE REGRESSION ANALYSIS RESULTS

$\ln(0.508 + C) = 1.0085 - 0.5149\,Y_1 + 1.2790\,Y_2$

Significant at the 0.005 level.

$R^2$ = 0.77 (percent of variance in chlorophyll concentration
accounted for by the regression equation)

$R$ = 0.88

$S_{c.y}$ = ± 1.62 mg-chl/m³ (standard deviation from regression plane)

$S_{c.y}$ = 19% of sample range.

Simultaneous Confidence Limits on Regression Coefficients:

97.5% CI on $\beta_1$: -0.9487 < $\beta_1$ < -0.0811

97.5% CI on $\beta_2$: 0.4487 < $\beta_2$ < 2.1093

C = Sum of chlorophylls a, b, and c averaged over secchi depth, which is assumed equal to
color producing depth.


FIGURE IVb